Evolution of Bit Strings: Some Preliminary Results

Authors

  • Harald Freund
  • Robert Wolter
Abstract

In this paper we show some preliminary results of simulations with a population of bit strings. We present the ideas of our model and point out the difficulties in implementing them. Although an open evolution is one of the final aims, we can present some interesting results with finite and even very small systems. One important result is the evidence that introducing start codons in the decoding scheme is reasonable. This seems to have biological relevance since start codons are used in natural decoding processes.

1. Introduction

1.1 General overview

During the last few years, attempts to model systems with evolutionary properties have become more and more numerous. The fundamental question one wants to answer is essentially: What are the basic and necessary ingredients that make evolution from simple building blocks to large and complex structures possible, likely, or even inevitable?

The whole evolution from single-cell bacteria to human beings depends on a very elaborate mechanism for self-reproduction. But it is generally believed now that even the first self-reproducing molecules had a rather complicated structure (they were RNA-like), which made their appearance by chance rather improbable. So there remains a gap between a primordial soup and the first self-reproducing molecules, and even more so between these and primitive organisms like bacteria.

There are various approaches trying to solve this puzzle. They include Eigen's hypercycle [1], autocatalytic networks [2], the Coreworld simulations of Rasmussen et al. [3], and Fontana's AlChemy [4].

* Part of this work was done while participating in the Institute for Scientific Interchange workshop on "Complexity and Evolution."
† Electronic mail address: freund@physik.uni-wuppertal.dbp.de
© 1991 Complex Systems Publications, Inc.
The guiding principle in specifying our model was that it should be as simple as possible, yet incorporate some features that we consider essential:

1. Individuals should be built up from several building blocks in order to provide a huge state space due to combinatorial explosion.

2. There should be no frequency-independent fitness, that is, no absolute goal.

3. Selection should play at least partially on individual building blocks in order to have "mild" mutations, where changes in one block leave most of the score of an individual unchanged.

4. We want to model a discrete stochastic system, not a system of differential equations. In an infinite system noise could be suppressed by the law of large numbers, and we could have limit cycles or chaotic attractors. In our finite systems we must approach some invariant distribution, though this might be reached so extremely slowly that it may be irrelevant.

1.2 Our model

We chose a population of fixed-length bit strings (they can be viewed as DNA-like molecules having only two instead of four bases, or as some very small molecules directly decoded by a bit string) swimming in a well stirred soup. This way interactions between two of them can be simulated by taking them at random with equal probability. For simplicity, these bit strings will be called "animals" in the following. For each animal we define a real variable storing the score (or fitness) of that animal. We should stress here that this "score" or "fitness" is not a prespecified global function, because the fitness of each animal is determined by its interactions with randomly chosen others. So there is only a "population-dependent" fitness that continually changes as the population changes.
The present population, considered as a point in the state space of all possible populations, is the local environment for each animal in which its local fitness is evaluated in a prespecified way.

The following two elements are necessary to create a sequence of generations:

  • An interaction scheme that changes the scores of the animals taking part in the interaction; details are given in section 2.1.
  • An updating or reproduction scheme that uses the scores to decide which animals go on to the next generation and which drop out and are replaced by new ones. The reproduction involves errors produced by mutations; details are given in section 2.2.

Implementing these two points leads to problems that are of either a technical or a conceptual nature. Let us explain what we mean by technical versus conceptual.

Mutation, which can be implemented in different ways, is used to add new material to the population during the reproduction phase. We apply single-point mutation, that is, "single-bit flip" (except in section 5 where we add "cut and splice" [5]), when an animal produces "offspring." This is simple to implement because it takes only one random number. There are other possibilities that cause on average the same mutation rate [6]. But as the main feature of mutation is to introduce new material, the results do not depend much on how it is done. We thus claim that how mutation is implemented is only a technical problem.

Whether to use mutation as a source of variability or other genetic operators such as inversion, replication, or the parasite model [7], is more a conceptual problem. The same is valid for sexual reproduction, which can employ crossover, assortative mating, recombination [8], and cut and splice, as described in section 5.3. The decoding scheme that is presented in detail in section 3 is also a conceptual problem.
If decoding is done in an unsuitable way our final goal, a "true" evolution, can never begin. We should point out, however, that for many features of an evolving system one cannot decide right from the start whether an implemental alternative is merely a technical or a conceptual problem.

2. Interaction and reproduction

2.1 The interaction scheme

Each time step of the simulation consists of two parts: first an interaction phase that calculates the scores of all animals, and thereafter a reproduction phase as described in section 2.2. The interaction phase first sets all scores to zero and then executes a number of "basic interactions." A basic interaction takes two animals $i$ and $k$ purely at random, selects one gene in each of them ($g_i$ and $g_k$), again at random, and performs the following update of the scores $S_i$ and $S_k$:

\[ S_i \leftarrow S_i + A(g_k, g_i), \qquad S_k \leftarrow S_k + A(g_i, g_k) \tag{2.1} \]

where $A$ denotes a random matrix with values taken from the interval $[-1, +1]$. The number of basic interactions was typically chosen to be a multiple, called a, of the number of animals. Most simulations were done with a = 10. Since two animals participate in one basic interaction, each animal takes part in 20 interactions on average. But we also conducted simulations with a = 20, 30, and 40.

2.2 Reproduction

In all simulations we implemented synchronous reproduction throughout the population. Let us consider some alternatives:

1. One simple reproduction scheme calculates the average score, removes all animals that are below average, and replaces them by mutating survivors. Notice that this implies the somewhat "unbiological" feature that well adapted animals do not die, but survive without being changed by mutations. But this mechanism is easy to implement, needs no sorting, and is carried out in time O(n), where n is the number of animals in the simulation.

2. Another scheme we implemented replaces a constant fraction of the population by mutants after every time step (i.e., in every reproduction phase). We chose the worst quarter of the population to be replaced. The new animals are mutants from

   (a) the best animals according to score (we choose from the best quarter),
   (b) all surviving animals, and
   (c) all animals, even from those that will be replaced.

   One argument for case (c) is the fact that even a mutant that is at first unsuccessful could contain some genetic material that proves useful in the future ("hopeful monsters"). Nevertheless we favored (a) because we have another possibility to preserve neutral genetic material (see section 3.3) and we wanted to simulate a fairly small population (64 or 128 individuals) for reasons of computational economy.

Another aspect of our model is the probability $P(S_i)$ for an animal $i$ with score $S_i$ to be chosen for reproduction. Very often a Fermi distribution of the form

\[ P(S_i) = \frac{\text{constant}}{1 + \exp[(S_i - \bar{s})/C]} \tag{2.2} \]

is taken. Here, $\bar{s} = (1/N) \sum_{i=1}^{N} S_i$ denotes the average score and $C$ is a temperature-like parameter. We used $C = 0$, which chooses all animals above average with equal probability. In addition we replaced $\bar{s}$ by a threshold $\tilde{s}$, a value that is smaller than the highest quarter of all scores but strictly larger than the rest (corresponding to case (a) above). We favored this case because it lets us use simple, equally distributed random numbers. Another advantage of this implementation is that we can easily estimate the number of animals $n_t$ that die after a certain number of time steps $t$ if the animals die purely at random. Explicit calculations are given in appendix A.

3. Decoding the bit strings

We subdivided the bit strings into functional groups that we call "genes." The name genes was chosen for convenience and should not be taken too literally.
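One time step of section 2 (an interaction phase following update (2.1), then reproduction scheme 2(a) with single-bit-flip mutants) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the authors' program: `toy_decode` is a hypothetical stand-in that reads every 4-bit chunk as one of 16 genes (the paper's own decoding is the subject of section 3), the two interacting animals are assumed to be distinct, and all function names are invented for this example.

```python
import random

GENE_BITS = 4                  # assumption: toy genes are fixed 4-bit chunks
N_GENES = 2 ** GENE_BITS       # 16 toy genes (the paper's tree decodes 20)

def toy_decode(bits):
    """Hypothetical stand-in decoder: consecutive 4-bit chunks -> gene numbers."""
    return [int(bits[p:p + GENE_BITS], 2)
            for p in range(0, len(bits) - GENE_BITS + 1, GENE_BITS)]

def random_matrix(rng):
    """Random interaction matrix A with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(N_GENES)]
            for _ in range(N_GENES)]

def interaction_phase(population, A, a, rng):
    """Set all scores to zero, then run a * N basic interactions (eq. 2.1)."""
    n = len(population)
    genes = [toy_decode(bits) for bits in population]
    scores = [0.0] * n
    for _ in range(a * n):
        i, k = rng.sample(range(n), 2)            # assumption: i != k
        gi, gk = rng.choice(genes[i]), rng.choice(genes[k])
        scores[i] += A[gk][gi]
        scores[k] += A[gi][gk]
    return scores

def point_mutation(bits, rng):
    """Single-bit flip; needs only one random number."""
    pos = rng.randrange(len(bits))
    flipped = "1" if bits[pos] == "0" else "0"
    return bits[:pos] + flipped + bits[pos + 1:]

def reproduction_phase(population, scores, rng):
    """Scheme 2(a): worst quarter replaced by mutants of the best quarter."""
    n = len(population)
    order = sorted(range(n), key=lambda i: scores[i])   # ascending score
    quarter = n // 4
    worst, best = order[:quarter], order[-quarter:]
    next_population = list(population)                  # survivors unchanged
    for idx in worst:
        parent = population[rng.choice(best)]
        next_population[idx] = point_mutation(parent, rng)
    return next_population

def time_step(population, A, a, rng):
    scores = interaction_phase(population, A, a, rng)
    return reproduction_phase(population, scores, rng), scores

rng = random.Random(0)
population = ["".join(rng.choice("01") for _ in range(32)) for _ in range(64)]
A = random_matrix(rng)
next_population, scores = time_step(population, A, a=10, rng=rng)
```

Choosing parents uniformly from the best quarter corresponds to the C = 0 case with the threshold placed just below the best quarter, which is why only equally distributed random numbers are needed here. Sorting is used for brevity; it costs O(n log n) rather than the O(n) of scheme 1.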
The concept of genes is closely connected with the distinction of genotype and phenotype, as sketched in figure 1. The genotype is the set of genes stored in the DNA. This determines the phenotype, which in turn determines the fitness. The fitness we can ascribe to an animal is not evaluated by looking at the genotype; it follows indirectly via the phenotype. As a consequence the genotype-fitness relation is very nonlinear and unpredictable.

[Figure 1: Genotype-phenotype relation. Diagram labels: populations (interaction), phenotype (character, fitness of animal), development, genotype (set of genes).]

In our model we have no explicit distinction between genotype and phenotype. The genes can interact directly via the interaction matrix A as described in section 2.1. This matrix implicitly contains the gene-phenotype map. The randomness of the values of the matrix elements is supposed to model the complexity and unpredictability of the gene → phenotype → fitness map.

The genes are decoded using a binary tree. To read a bit string from any prescribed position forward, one starts at the root and moves down the tree while the bits in the animal decide which way to go (0 turns left, 1 turns right) and stops if a leaf (which contains the number of the gene) is encountered. This procedure is continued until the whole bit string of an animal is decoded. In most simulations we worked with a randomly generated tree that is shown in table 1 and in figure 2. It decodes 20 genes with lengths between 3 and 8 bits.

The main reason for using a binary tree instead of a randomly generated list of genes (that can also be listed in a table like table 1) is that this way we are able to decode every possible bit string without ambiguity. Each path through the binary tree leads to a gene after at most 8 left-right decisions.
There is neither arbitrariness (one sequence → many interpretations) nor uninterpretable sequences. In addition this tree allows a simple extension of the model where the tree is able to grow: A leaf can be replaced by a node that adds two genes to the tree in the next level, so new and longer genes can be introduced and effectively change the environment the animals are living in. But don't forget that even without a growing tree we have no static environment.

Even with a given representation there are different ways to decode a bit string. We describe some of them in detail in the following subsections.

Gene  Length  Bit representation
A     3       001
B     3       011
C     3       110
D     3       111
E     4       0000
F     4       0100
G     4       0101
H     4       1000
I     4       1010
J     4       1011
K     5       10011
L     6       000100
M     6       000101
N     6       000110
O     6       000111
P     6       100100
Q     8       10010100
R     8       10010101
S     8       10010110
T     8       10010111
sc    3       110

Table 1: Bit representation of the genes and of the start codon (see section 3.3) that were used in all simulations. The start codon is denoted by "sc."

3.1 Genes packed in highest density

The largest number of functional groups in a bit string can be obtained by joining the beginning and the end of the animal, creating a circular sequence. A gene starts at every position; that is, an animal with 32 bits stores 32 genes. In this case the genes overlap maximally and we see strong correlations between genes. So the interaction between animals is not the sum of independent interactions between single genes, but of interactions between groups of genes. This is also known as the hitchhiker effect. For a detailed example see figure 3. Simulation results are presented in section 5.1.

3.2 Gene beside gene (a)

The next possibility avoids overlaps of functional groups by scanning the bit string from the beginning to the end and starting a new gene just after the previous one is completed.
The last bits, which belong to no gene, are not interpreted. An example is shown in figure 4.

A point mutation at the end of an animal, decoded this way, will have a small effect on the genetic contents. The same is true for a mutation at the beginning of the string if this does not change the reading frame. But

[Figure 2: Bit representation of the genes that were used in all simulations, shown as a tree (as mentioned in section 3).]

Bits         Gene
0 0 0 1 0 0  L
  0 0 1      A
    0 1 0 0  F

Figure 3: Bit string decoded as described in section 3.1. In this example the genes start at each position in the bit string so the density of genes is maximal. The substring "000100" is always decoded as the sequence "LAF," which is followed by a gene that starts with "100" (compare table 1 and figure 2).
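The tree decoding of section 3 and the two scanning schemes of sections 3.1 and 3.2 can be illustrated with a short sketch. The gene codes are copied from table 1; everything else is an assumption made for illustration: the nested-dictionary representation of the tree, the function names (`build_tree`, `read_gene`, `decode_dense`, `decode_sequential`), and the handling of wrap-around. The start codon of section 3.3 is not used here.

```python
# Gene codes from table 1 (the start codon "sc" is left out of this sketch).
GENE_CODES = {
    "A": "001", "B": "011", "C": "110", "D": "111",
    "E": "0000", "F": "0100", "G": "0101", "H": "1000",
    "I": "1010", "J": "1011", "K": "10011",
    "L": "000100", "M": "000101", "N": "000110", "O": "000111",
    "P": "100100",
    "Q": "10010100", "R": "10010101", "S": "10010110", "T": "10010111",
}

def build_tree(codes):
    """Binary tree as nested dicts: inner nodes map '0'/'1' to a child,
    leaves are gene names (an assumed representation)."""
    root = {}
    for gene, code in codes.items():
        node = root
        for bit in code[:-1]:
            node = node.setdefault(bit, {})
        node[code[-1]] = gene
    return root

def read_gene(tree, bits, start, circular=False):
    """Walk down from the root, starting at position `start`.
    Returns (gene, next_position), or (None, start) if the bits run out."""
    node, pos, n = tree, start, len(bits)
    while isinstance(node, dict):
        if pos >= n and not circular:
            return None, start
        node = node[bits[pos % n]]      # wraps around only when circular
        pos += 1
    return node, pos

def decode_dense(tree, bits):
    """Section 3.1: circular string, one gene starting at every position."""
    return [read_gene(tree, bits, p, circular=True)[0]
            for p in range(len(bits))]

def decode_sequential(tree, bits):
    """Section 3.2: a new gene starts right after the previous one;
    trailing bits that complete no gene are not interpreted."""
    genes, pos = [], 0
    while True:
        gene, next_pos = read_gene(tree, bits, pos)
        if gene is None:
            return genes
        genes.append(gene)
        pos = next_pos

tree = build_tree(GENE_CODES)
print("".join(decode_dense(tree, "000100")[:3]))   # LAF, as in figure 3
print(decode_sequential(tree, "00010000101"))      # ['L', 'A']
```

Because the codes in table 1 form a complete prefix code (their Kraft sum is exactly 1), every inner node has both children, so the walk in `read_gene` can never reach a missing branch.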


Journal:
  • Complex Systems

Volume 5  Issue 

Pages  -

Publication date 1991